ABSTRACT



The National Assessment of Educational Progress (NAEP) has 
always included students with disabilities and limited English proficient 
(LEP) students in the sample to be assessed, but only in relatively limited 
numbers. Recent research has indicated that many students who have been 
excluded are in fact capable of participating. The National Center for 
Education Statistics has field tested the use of revised criteria for 
deciding whether students should participate in the assessment and various 
accommodations and adaptations to remove barriers to participation. Inclusion 
of more students with disabilities and LEP students was studied in three 
samples for the national NAEP mathematics and science assessments and in the 
state NAEP assessments in mathematics and science. Students with disabilities 
and LEP students were oversampled in the national assessments but not in the 
state assessments. Students with disabilities in one sample from the national 
assessments were offered various accommodations to remove barriers to their 
participation. Changes have been incorporated into the 1996 NAEP to further 
the goal of maximum inclusion of students with disabilities and LEP students, 
while maintaining trend measurements from past assessments and continuing to 
report on the academic performance of all students in the nation. These 
changes will result in an improved and more representative national 
assessment program and will benefit states and school districts as the NAEP 
often serves as a model for the best practice in large-scale assessment. 
(Contains one figure.) (SLD)

The National Assessment of Educational Progress (NAEP) has always tried to be
representative of the academic performance of all students in the nation. In the past, 
students with disabilities and limited English proficient (LEP) students were often 
excluded from NAEP. In the case of many students with disabilities, their Individualized 
Education Programs (IEPs) stipulated they were not to be tested. In other instances, 
school staff or parents believed these students were unable to participate due to their 
perceived limitations. Other students might have been able to take the assessment if 
certain modifications or accommodations were made to the testing environment to 
remove barriers to their participation, but, prior to the 1996 NAEP, these
accommodations were not available. Finally, some students were excluded by their schools
because it was feared their presumably lower performance would decrease the school,
district, or state average score.

NAEP has always included students with disabilities and LEP students in the
sample selected to be assessed. In 1994, however, about one-third to half (depending on
the grade) of the students with disabilities in the sample actually participated in the
assessment, and about half to two-thirds of the LEP students (depending on the grade)
participated.

Recent research has indicated that many students with disabilities and LEP
students who were excluded from NAEP were in fact capable of participating in the
assessment, especially if the criteria for including students were changed and if certain
accommodations and adaptations were offered.¹ This research suggested that about 70
percent of the excluded 4th-grade students with disabilities and a sizable proportion of
excluded LEP students could have participated in NAEP.

With the encouragement and counsel of various offices in the Department of 
Education, as well as nongovernmental organizations interested in the inclusion issue, in 
1995 the National Center for Education Statistics (NCES) field-tested the use of revised 
criteria for deciding whether students should participate in the assessment, and various 
accommodations and adaptations to remove barriers to participation. Results indicated 
that these procedures could be implemented successfully in the full 1996 national 
assessments of mathematics and science. 

The NAEP field test also raised certain methodological questions that only a
fuller assessment could help to answer. These questions included:

• How could trends be measured from past assessments if procedures changed and
greater proportions of students with disabilities and LEP students were assessed than
in the past?

• What would be the impact on the assessment of the changes in inclusion criteria and
presence of accommodations?

• Would it be possible to include the results of accommodated students in the overall
results, i.e., would those students perform on the assessment in a way similar to other
students? If not, the responses of accommodated students could not be scaled along
with responses from students assessed under nonaccommodated conditions.

¹ Stancavage, F., et al. “Study of Exclusion and Assessability of Students with Disabilities in the 1994 Trial
State Assessment of the National Assessment of Educational Progress.” Also Stancavage, F., et al. “Study
of Exclusion and Assessability of Students with Limited English Proficiency in the 1994 Trial State
Assessment of the National Assessment of Educational Progress.” In Quality and Utility: The 1994 Trial
State Assessment in Reading, Background Studies. Stanford, CA: National Academy of Education, 1996.

To answer these questions, and to perform the main task of measuring what the nation’s
students know and can do, the 1996 NAEP design included a research component.

Sample Design for the 1996 NAEP Mathematics Assessment — National Level 

In 1996, mathematics and science were assessed nationally at the 4th, 8th, and 12th
grades. In states that elected to participate, mathematics was assessed in the 4th and 8th
grades, and science was assessed at the 8th grade.² In the national-level mathematics
assessment, three roughly equal subsamples were created (n = 3,500, approximately).
Students with disabilities and LEP students were oversampled in all three subsamples.
Figure 1 illustrates this sample design.



² At the 4th grade, 47 states and other jurisdictions participated; 44 jurisdictions participated at the 8th grade.

Figure 1

Sample Design for 1996 NAEP Mathematics Assessment

S1: 1992 Inclusion Criteria without Accommodations
S2: 1996 Inclusion Criteria without Accommodations
S3: 1996 Inclusion

NAEP results are often used to measure the progress of students from past
assessments. The 1996 mathematics results, for example, can be compared to those for
1992 and 1990. Such comparison, however, requires that testing conditions and the
characteristics of the student sample remain equivalent over the assessments to be
compared. Thus, it was necessary to assess at least part of the 1996 national sample using
the same inclusion criteria as were used in 1990 and 1992, and accommodations could
not be offered to students in that subsample, which was called “S1.” By comparing
results from S1 with those from the previous assessments, trends in performance could be
presented.

The second subsample, designated “S2,” was assessed using revised inclusion
criteria for students with disabilities and for LEP students. No accommodations were 
offered to this subsample, however, so that the effect of the changes in criteria on 
participation and performance could be evaluated. The previously used criteria 
emphasized exclusion of students, whereas the revised criteria emphasized inclusion. For 
students with disabilities, the previous criteria instructed schools to exclude students from 
NAEP if they were “mainstreamed” less than 50 percent of the time in academic subjects, 
or if they were judged to be incapable of participating “meaningfully” in the assessment. 
The revised criteria for 1996 left less room for judgment on the part of school staff, and
instructed schools to include students with disabilities unless their IEP stipulated they
should not participate, their IEP team determined they could not participate, or their
cognitive functioning was so severely impaired that they could not participate.
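
The two sets of criteria for students with disabilities can be read as decision rules. The sketch below is only an illustration of that reading, not NAEP's actual procedure: the class, field, and function names are hypothetical, and each field simply paraphrases one condition named in the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class StudentWithDisability:
    # Hypothetical field names; each paraphrases a condition from the text.
    mainstreamed_share: float          # fraction of academic-subject time mainstreamed
    judged_capable: bool               # staff judged the student able to participate "meaningfully"
    iep_says_no_testing: bool          # IEP stipulates the student should not participate
    iep_team_says_cannot: bool         # IEP team determined the student cannot participate
    severe_cognitive_impairment: bool  # impairment so severe the student cannot participate


def sd_included_previous(s: StudentWithDisability) -> bool:
    """Previous criteria: exclude if mainstreamed less than 50 percent
    of the time or judged incapable of meaningful participation."""
    return not (s.mainstreamed_share < 0.5 or not s.judged_capable)


def sd_included_revised(s: StudentWithDisability) -> bool:
    """Revised 1996 criteria: include unless one of three conditions holds."""
    return not (
        s.iep_says_no_testing
        or s.iep_team_says_cannot
        or s.severe_cognitive_impairment
    )


# An invented student: mainstreamed 30 percent of the time, judged capable,
# with no IEP restriction and no severe cognitive impairment.
student = StudentWithDisability(0.3, True, False, False, False)
print(sd_included_previous(student), sd_included_revised(student))
```

Under these assumptions the invented student is excluded by the previous rule, because of the 50 percent mainstreaming threshold, but included by the revised rule, which no longer refers to mainstreaming or to staff judgment of "meaningful" participation.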

For LEP students, the previously used criteria instructed schools to exclude
students if their native language was other than English, and they had been enrolled in an
English-speaking school (not including a bilingual education program) for less than two
years, and they were judged incapable of taking part in the assessment. The revised 1996
criteria instructed the school to include the students if they had received instruction in
English for at least three years, or if the staff determined the students could participate in
English even though they had received instruction in English for less than three years.
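
The LEP criteria can be sketched the same way. Again, this is a hedged illustration rather than the operational NAEP rule: the names are hypothetical, and a single staff-judgment field stands in for both the old "judged incapable" condition and the new "staff determined the students could participate" condition, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class LEPStudent:
    # Hypothetical field names; each paraphrases a condition from the text.
    native_language_is_english: bool
    years_in_english_speaking_school: float  # not counting bilingual programs
    years_of_english_instruction: float
    staff_judges_capable: bool               # simplifying assumption: one judgment for both rules


def lep_included_previous(s: LEPStudent) -> bool:
    """Previous criteria: exclude only if all three conditions hold."""
    excluded = (
        not s.native_language_is_english
        and s.years_in_english_speaking_school < 2
        and not s.staff_judges_capable
    )
    return not excluded


def lep_included_revised(s: LEPStudent) -> bool:
    """Revised 1996 criteria: include with at least three years of
    instruction in English, or if staff determine the student can participate."""
    return s.years_of_english_instruction >= 3 or s.staff_judges_capable


# An invented student with 2.5 years of English instruction whom staff
# judge unable to participate in English.
student = LEPStudent(False, 2.5, 2.5, False)
print(lep_included_previous(student), lep_included_revised(student))
```

For this invented student the previous rule does not exclude (the two-year condition is not met), while the revised rule does not include (neither the three-year condition nor the staff judgment is met), which shows how the longer threshold can be the more restrictive one for some students.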

The inclusion criteria for LEP students were changed because the old criteria were 
not working well. The staff of the Education Department’s Office of Bilingual Education 
and Minority Language Affairs (OBEMLA), for example, pointed out that the term 
“English-speaking school” did not necessarily mean that all its students were receiving
instruction in English. All schools in the national sample are English-speaking. The 
OBEMLA staff also felt that two years of English instruction is not sufficient for a 
student to participate meaningfully, and advised that states with high concentrations of 
LEP students, such as California, Texas, Florida, Michigan, and Arizona, use three years 
of English instruction as their criterion for inclusion in assessments. While the 1996 
criterion of three years is more restrictive than the previously used two-year threshold, 
bilingual educators contend that the new criteria are more appropriate and consistent with 
research studies in this area. 

The students with disabilities and the LEP students in the third subsample in the 
1996 NAEP in mathematics, designated “S3,” were offered various accommodations and 
adaptations. By restricting these modifications to students in S3, it is possible to compare
the results obtained under nonstandard testing conditions with those obtained under standard
conditions, using both the new and the old criteria.

The presence of accommodations in S3 meant that the criteria for inclusion were 
slightly different from those used in S2. Students with disabilities and LEP students were 
to be included in the assessment if they could participate with the available 
accommodations. The accommodations are described in a later section of this paper. 

Sample Design for the 1996 NAEP Science Assessment — National Level 

The science assessment presented a simpler sample design problem. There was 
no need to measure the trend from previous assessments because the 1996 assessment
was based on a new framework. While there was a science assessment in 1990, the 1996
assessment tested different content, incorporated more hands-on experiments, and placed
greater emphasis on extended-response items, in which students are asked to explain their 
solutions and reasoning. These changes prevent meaningful comparisons with results 
from the earlier assessment. 

With no trend to be measured in the science assessment, S1 could be dropped.
There was no need to recreate the conditions of the previous assessment. Only S2, in 
which the new inclusion criteria were used, and S3, in which accommodations were 
offered as well, were needed. 

Sample Design for State Assessments in Mathematics and Science 

In addition to a national sample, as mentioned above, NAEP also assesses 
representative samples in states that elect to participate. In 1996, mathematics was 
assessed at grades 4 and 8, and science was assessed at grade 8. 

No accommodations were offered in the state assessments. The states themselves 
are responsible for administering NAEP at the state level. It was decided not to impose 
the burden of providing accommodations on the states in 1996. States had already agreed 
to participate without being asked to provide accommodations. Additionally, there were
concerns that the accommodations might not be administered in a uniform manner across
all states, which would have reduced the comparability of results. Consequently, there
was no S3 at the state level.

Both the mathematics and science assessments in the states were conducted using
subsamples S1 and S2. This strategy permitted measuring trends from previous
mathematics assessments, as S1 students were assessed under the same conditions and
using the same inclusion criteria as in 1990 and 1992. While there was no trend to
maintain in science, administering the science assessment to half the schools in the states
using the previous inclusion criteria provided further evidence of the effect of the new
inclusion criteria when compared with results from the S2 subsample, and in comparison
to national-level results. There was no oversampling of students with disabilities or LEP
students at the state level.
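
The national and state designs described in the preceding sections can be summarized in a small lookup table. The sketch below is a reading aid under stated assumptions: the dictionary and function names are hypothetical, and S3 is labeled simply as "revised" even though its criteria differed slightly from those of S2 because accommodations were available.

```python
# Testing conditions for each subsample, as described in the text.
SUBSAMPLES = {
    "S1": {"criteria": "previous", "accommodations": False},
    "S2": {"criteria": "revised", "accommodations": False},
    "S3": {"criteria": "revised", "accommodations": True},  # criteria adjusted slightly for accommodations
}

# Which subsamples were used at each level and in each subject in 1996.
DESIGN = {
    ("national", "mathematics"): ["S1", "S2", "S3"],
    ("national", "science"): ["S2", "S3"],
    ("state", "mathematics"): ["S1", "S2"],
    ("state", "science"): ["S1", "S2"],
}

# Students with disabilities and LEP students were oversampled only nationally.
OVERSAMPLED = {"national": True, "state": False}


def offers_accommodations(level: str, subject: str) -> bool:
    """True if any subsample in this assessment offered accommodations."""
    return any(SUBSAMPLES[s]["accommodations"] for s in DESIGN[(level, subject)])


def uses_previous_criteria(level: str, subject: str) -> bool:
    """True if some subsample in this assessment used the previous inclusion criteria."""
    return any(SUBSAMPLES[s]["criteria"] == "previous" for s in DESIGN[(level, subject)])


for key, subsamples in DESIGN.items():
    print(key, subsamples, offers_accommodations(*key), uses_previous_criteria(*key))
```

Laid out this way, the table makes two design choices easy to see: accommodations appear only in the national assessments, and the national science assessment is the only one with no subsample assessed under the previous criteria.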

Accommodations Provided to Students with Disabilities 

Students with disabilities in the national S3 sample were offered various 
accommodations to remove barriers to their participation. These students were given 
whatever accommodations they received as part of their Individualized Education 
Program. The accommodation requested most often was extended time. Since NAEP is 
not a “speed” test, it was believed this accommodation would not make the assessment 
easier for students who used it, but it would allow those with disabilities who need more 
time to better show their knowledge. Other accommodations included one-on-one 
testing, help with directions, reading items aloud, signing for the deaf, and provision of 
Braille or large-print versions of the assessment. In science, physically or visually 
handicapped students were exempted from the hands-on tasks if necessary. 

In mathematics only, LEP students in S3 were offered a bilingual version of the
test booklets, with Spanish and English on facing pages. (Spanish was provided because about
70 percent of LEP students are Spanish-speaking.) While no bilingual booklets were
provided for the science assessment, LEP students were offered Spanish-English
glossaries and word lists.

While the results from the 1996 assessment have not yet been analyzed, it is likely that
accommodations will continue to be provided to some extent in future NAEP
assessments. Many states are also offering accommodations to students with disabilities
and to LEP students in their own assessments.

Research Issues 

The main question to be answered by this large NAEP experimental design is 
whether the introduction of revised inclusion criteria and accommodations to testing 
conditions resulted in increased participation of students with disabilities or LEP 
students. Recent data from the 1996 NAEP Mathematics Report Card for the Nation and
the States³ indicate that the revised criteria alone did not result in increased participation
rates, but that the provision of accommodations did result in greater inclusion.

A major issue is the validity of results for students assessed under nonstandard, or 
accommodated, conditions. Does the assessment measure the same thing when testing 
conditions differ? Are the same skills being measured? Research toward answering these 
questions is being conducted in the analysis of the results from the different subsamples, 
especially the analyses of item-specific statistics for students using the accommodations 
compared to students assessed under standard conditions. 

NCES is striving to determine the best assessment practices: those that result in the
maximum possible inclusion of students with disabilities and LEP students and that yield
results that validly measure what those students know and can do. To further this
objective, NCES is sponsoring several research projects. For example, researchers at the
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
are exploring the role of native language proficiency in the performance of LEP students
in bilingual assessments, and the use of simplified wording and Spanish translations of
math items for these students. Another study, being conducted by the Council of Chief
State School Officers (CCSSO), is investigating whether more accurate scoring of
responses to math and science items from LEP students can be obtained by training
scorers to recognize the linguistic patterns of syntax and spelling used by non-native
speakers of English, so they can separate respondents’ content knowledge from their
English-language skills.⁴

³ Reese, C.M., Miller, K.E., Mazzeo, J., and Dossey, J.A. 1996 NAEP Mathematics Report Card for the
Nation and the States. Washington, DC: National Center for Education Statistics, 1997, chapter 4.

Conclusion

Changes were incorporated into the 1996 NAEP to further the goal of maximum
inclusion of students with disabilities and LEP students, while maintaining trend 
measurements from past assessments and continuing to report on the academic 
performance of all students in the nation in a valid and reliable way. These changes will 
result in an improved and more representative national assessment program. These 
inclusion efforts will also benefit states and school districts, as NAEP often serves as a 
model for the best practice in large-scale assessment. 




⁴ Inclusion practices in the states and research activities related to inclusion being conducted by federal and
nonfederal organizations are reviewed in Olson, J. and Goldstein, A. The Inclusion of Students with 
Disabilities and Limited English Proficient Students in Large-Scale Assessments: A Summary of Recent 
Progress. National Center for Education Statistics, forthcoming. 